 advanced computer science and application


Person Re-Identification System at Semantic Level based on Pedestrian Attributes Ontology

Ly, Ngoc Q., Cao, Hieu N. M., Nguyen, Thi T.

arXiv.org Artificial Intelligence

Person Re-Identification (Re-ID) is an important task in video surveillance systems, with applications such as tracking people, finding people in public places, and analysing customer behavior in supermarkets. Although many works have addressed this problem, challenges remain, such as large-scale datasets, imbalanced data, viewpoint variation, and fine-grained data (attributes). In particular, local features are not employed at the semantic level in the online stage of the Re-ID task, and the imbalanced data problem of attributes is not taken into consideration. This paper proposes a unified Re-ID system consisting of three main modules: Pedestrian Attribute Ontology (PAO), Local Multi-task DCNN (Local MDCNN), and Imbalance Data Solver (IDS). The main novelty of our Re-ID system is the mutual support of PAO, Local MDCNN, and IDS, which exploits the inner-group correlations of attributes, pre-filters mismatched candidates from the gallery set based on semantic information such as fashion attributes and facial attributes, and solves the imbalanced data problem of attributes without adjusting the network architecture or data augmentation. We experimented on the well-known Market1501 dataset. The experimental results show the effectiveness of our Re-ID system, which achieves higher performance on Market1501 than some state-of-the-art Re-ID methods.


A Comprehensive Study on Medical Image Segmentation using Deep Neural Networks

Dao, Loan, Ly, Ngoc Quoc

arXiv.org Artificial Intelligence

Over the past decade, Medical Image Segmentation (MIS) using Deep Neural Networks (DNNs) has achieved significant performance improvements and holds great promise for future developments. This paper presents a comprehensive study on MIS based on DNNs. Intelligent Vision Systems are often evaluated based on their output levels, such as Data, Information, Knowledge, Intelligence, and Wisdom (DIKIW), and the state-of-the-art solutions in MIS at these levels are the focus of research. Additionally, Explainable Artificial Intelligence (XAI) has become an important research direction, as it aims to uncover the "black box" nature of previous DNN architectures to meet the requirements of transparency and ethics. The study emphasizes the importance of MIS in disease diagnosis and early detection, particularly for increasing the survival rate of cancer patients through timely diagnosis. XAI and early prediction are considered two important steps in the journey from "intelligence" to "wisdom." Additionally, the paper addresses existing challenges and proposes potential solutions to enhance the efficiency of implementing DNN-based MIS.


Recent Advances in Medical Image Classification

Dao, Loan, Ly, Ngoc Quoc

arXiv.org Artificial Intelligence

Medical image classification is crucial for diagnosis and treatment, benefiting significantly from advancements in artificial intelligence. The paper reviews recent progress in the field, focusing on three levels of solutions: basic, specific, and applied. It highlights advances in traditional methods using deep learning models like Convolutional Neural Networks and Vision Transformers, as well as state-of-the-art approaches with Vision Language Models. These models tackle the issue of limited labeled data, and enhance and explain predictive results through Explainable Artificial Intelligence.


Investigating Retrieval-Augmented Generation in Quranic Studies: A Study of 13 Open-Source Large Language Models

Khalila, Zahra, Nasution, Arbi Haza, Monika, Winda, Onan, Aytug, Murakami, Yohei, Radi, Yasir Bin Ismail, Osmani, Noor Mohammad

arXiv.org Artificial Intelligence

Accurate and contextually faithful responses are critical when applying large language models (LLMs) to sensitive and domain-specific tasks, such as answering queries related to Quranic studies. General-purpose LLMs often struggle with hallucinations, where generated responses deviate from authoritative sources, raising concerns about their reliability in religious contexts. This challenge highlights the need for systems that can integrate domain-specific knowledge while maintaining response accuracy, relevance, and faithfulness. In this study, we investigate 13 open-source LLMs categorized into large (e.g., Llama3:70b, Gemma2:27b, QwQ:32b), medium (e.g., Gemma2:9b, Llama3:8b), and small (e.g., Llama3.2:3b, Phi3:3.8b). A Retrieval-Augmented Generation (RAG) approach is used to compensate for the limitations of using standalone models. This research utilizes a descriptive dataset of Quranic surahs, including the meanings, historical context, and qualities of the 114 surahs, allowing the model to gather relevant knowledge before responding. The models are evaluated using three key metrics set by human evaluators: context relevance, answer faithfulness, and answer relevance. The findings reveal that large models consistently outperform smaller models in capturing query semantics and producing accurate, contextually grounded responses. The Llama3.2:3b model, even though it is considered small, does very well on faithfulness (4.619) and relevance (4.857), showing the promise of smaller architectures that have been well optimized. This article examines the trade-offs between model size, computational efficiency, and response quality when using LLMs in domain-specific applications.
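
The retrieval step of a RAG pipeline like the one described above can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words cosine similarity, the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`), the prompt wording, and the three short documents are all assumptions chosen for illustration, and the call to an actual LLM is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Place the retrieved context in front of the question (hypothetical template)."""
    context = "\n".join(retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


# Illustrative placeholder descriptions, not the dataset used in the paper.
documents = [
    "Al-Fatihah is the opening surah of the Quran and has seven verses.",
    "Al-Baqarah is the longest surah of the Quran.",
    "Al-Ikhlas is a short surah about the oneness of God.",
]
query = "Which surah is the longest?"
prompt = build_prompt(query, documents)
```

The resulting `prompt` would then be passed to whichever model is under evaluation. A production system would more likely use dense embeddings than word counts, but the structure (retrieve first, then generate from the retrieved context) is the same.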


Understanding Mental Health Content on Social Media and Its Effect Towards Suicidal Ideation

Bhuiyan, Mohaiminul Islam, Kamarudin, Nur Shazwani, Ismail, Nur Hafieza

arXiv.org Artificial Intelligence

This review underscores the critical need for effective strategies to identify and support individuals with suicidal ideation, exploiting technological innovations in ML and DL to further suicide prevention efforts. The study details the application of these technologies in analyzing vast amounts of unstructured social media data to detect linguistic patterns, keywords, phrases, tones, and contextual cues associated with suicidal thoughts. It explores various ML and DL models like SVMs, CNNs, LSTM, neural networks, and their effectiveness in interpreting complex data patterns and emotional nuances within text data. The review discusses the potential of these technologies to serve as a life-saving tool by identifying at-risk individuals through their digital traces. Furthermore, it evaluates the real-world effectiveness, limitations, and ethical considerations of employing these technologies for suicide prevention, stressing the importance of responsible development and usage. The study aims to fill critical knowledge gaps by analyzing recent studies, methodologies, tools, and techniques in this field. It highlights the importance of synthesizing current literature to inform practical tools and suicide prevention efforts, guiding innovation in reliable, ethical systems for early intervention. This research synthesis evaluates the intersection of technology and mental health, advocating for the ethical and responsible application of ML, DL, and NLP to offer life-saving potential worldwide while addressing challenges like generalizability, biases, privacy, and the need for further research to ensure these technologies do not exacerbate existing inequities and harms.


Movement Control of Smart Mosque's Domes using CSRNet and Fuzzy Logic Techniques

Blasi, Anas H., Lababede, Mohammad Awis Al, Alsuwaiket, Mohammed A.

arXiv.org Artificial Intelligence

Mosques are places of worship of Allah and must be kept clean and immaculate and provide every comfort for the worshippers in them. The Prophet's Mosque in Medina, Saudi Arabia, is one of the most important mosques for Muslims. It ranks second after the Sacred Mosque in Mecca, Saudi Arabia, and is constantly overcrowded by Muslims visiting the Prophet Mohammad's tomb. This paper proposes a smart dome model that preserves fresh air and allows sunlight to enter the mosque using artificial intelligence techniques. The proposed model controls the domes' movements based on the weather conditions and the overcrowding rates in the mosque. The data were collected from two sources: the database of Saudi Arabia's weather history and the Shanghai Technology Database. The Congested Scene Recognition Network (CSRNet) and fuzzy techniques were applied using the Python programming language to control the domes so that they open and close for a specific time to renew the air inside the mosque. The model consists of several connected parts that control the mechanism for opening and closing the domes according to weather data and the crowding situation in the mosque. Finally, the main goal of this paper has been achieved: the proposed model works efficiently and specifies the exact duration for which the domes are automatically kept open, a few minutes at the top of each hour.


A Cost-Efficient Approach for Creating Virtual Fitting Room using Generative Adversarial Networks (GANs)

Attallah, Kirolos, Zaky, Girgis, Abdelrhim, Nourhan, Botros, Kyrillos, Dife, Amjad, Negied, Nermin

arXiv.org Artificial Intelligence

Customers all over the world want to see whether clothes fit them before purchasing. Therefore, customers naturally prefer brick-and-mortar clothes shopping, where they can try on products before buying them. However, after the COVID-19 pandemic, many sellers either shifted to online shopping or closed their fitting rooms, which made the shopping process hesitant and doubtful. The fact that clothes may not suit their buyers after purchase led us to consider using new AI technologies to create an online platform, or virtual fitting room (VFR), in the form of a mobile application and a deployed model on a webpage that can later be embedded in any online store, where customers can try on any number of clothing items without physically trying them. It will also save customers much of the time spent searching for what they need. Furthermore, it will reduce crowding and hassle in physical shops by applying the same technology through a special type of mirror that enables customers to try on clothes faster. From the business owners' perspective, this project will greatly increase their online sales and will preserve product quality by avoiding the issues of physical trials. The main approach used in this work applies Generative Adversarial Networks (GANs) combined with image processing techniques to generate one output image from two input images: the person image and the cloth image. This work achieved results that outperformed the state-of-the-art approaches found in the literature.


Adversarial Sampling for Fairness Testing in Deep Neural Network

Ige, Tosin, Marfo, William, Tonkinson, Justin, Adewale, Sikiru, Matti, Bolanle Hafiz

arXiv.org Artificial Intelligence

In this research, we focus on the use of adversarial sampling to test for fairness in the predictions of a deep neural network model across different classes of images in a given dataset. While several frameworks have been proposed to ensure the robustness of machine learning models against adversarial attacks, some of which include adversarial training algorithms, there is still the pitfall that adversarial training tends to cause disparity in accuracy and robustness among different groups. Our research is aimed at using adversarial sampling to test for fairness in the predictions of a deep neural network model across different classes or categories of images in a given dataset. We successfully demonstrated a new method of ensuring fairness across various groups of inputs in a deep neural network classifier. We trained our neural network model on the original images only, without training it on the perturbed or attacked images. When we fed the adversarial samples to our model, it was able to predict the original category or class of the image to which each adversarial sample belongs. We also introduced and used the separation of concerns concept from software engineering, whereby an additional standalone filter layer filters each perturbed image by heavily removing the noise or attack before automatically passing it to the network for classification; with this, we achieved an accuracy of 93.3%. The CIFAR-10 dataset has ten categories, so, to account for fairness, we applied our hypothesis across each category of the dataset and obtained consistent results and accuracy.


Investigating Group Distributionally Robust Optimization for Deep Imbalanced Learning: A Case Study of Binary Tabular Data Classification

Mustapha, Ismail. B., Hasan, Shafaatunnur, Nabbus, Hatem S Y, Montaser, Mohamed Mostafa Ali, Olatunji, Sunday Olusanya, Shamsuddin, Siti Maryam

arXiv.org Artificial Intelligence

Owing to increased data availability, novel learning architectures and accessibility to commodity computational hardware devices, deep neural networks (DNNs) have become the de facto tool for a wide range of machine learning (ML) tasks in recent times; leading to state-of-the-art performance in several computer vision, natural language processing and speech recognition tasks. DNNs are characterized by several layers of hidden units that enable learning of useful representations of a given data for improved model performance [1, 2]. This alleviates the need for domain experts and hand-engineered features, a common prerequisite for traditional ML methods. Oversampling and undersampling are two common data resampling approaches used in DNN. However, the susceptibility of the former to noise and overfitting due to added samples [23] as well as the characteristic loss of valuable information peculiar with the latter [3] remain major drawbacks of this category of imbalance methods. On the other hand, the core idea behind the cost sensitive methods is to assign different misclassification cost/weights to the training samples to scale up/down the misclassification errors depending on the class they belong [17, 24]. While there are several implementations of this method, the most commonly used cost sensitive approach in imbalanced deep learning research is reweighting
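
The cost-sensitive reweighting idea mentioned in this excerpt can be sketched as follows. This is one common variant, assumed here for illustration and not necessarily the one studied in the paper: each class receives a weight inversely proportional to its frequency, and the weight scales that sample's term in a binary cross-entropy loss. The function names and the toy labels are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def weighted_cross_entropy(labels, probs, weights):
    """Mean binary cross-entropy where each sample's loss is scaled by its class weight.

    probs[i] is the predicted probability that sample i belongs to class 1.
    """
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        loss = -math.log(p) if y == 1 else -math.log(1.0 - p)
        total += weights[y] * loss
    return total / len(labels)


# Toy imbalanced data: eight majority (0) samples and two minority (1) samples.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
weights = inverse_frequency_weights(labels)

# The same mistake costs more when it is made on a minority sample.
minority_error = weighted_cross_entropy([1], [0.1], weights)
majority_error = weighted_cross_entropy([0], [0.9], weights)
```

With these weights an error on the minority class contributes four times as much to the loss as the equivalent error on the majority class, which is the "scale up/down" effect the text describes.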


An Intelligent Decision Support Ensemble Voting Model for Coronary Artery Disease Prediction in Smart Healthcare Monitoring Environments

Maach, Anas, Elalami, Jamila, Elalami, Noureddine, Mazoudi, El Houssine El

arXiv.org Artificial Intelligence

Coronary artery disease (CAD) is one of the most common cardiac diseases worldwide and causes disability and economic burden. It is the world's leading and most serious cause of mortality, with approximately 80% of deaths reported in low- and middle-income countries. The preferred and most precise diagnostic tool for CAD is angiography, but it is invasive, expensive, and technically demanding. As a result, the research community is increasingly interested in the computer-aided diagnosis of CAD using machine learning (ML) methods. The purpose of this work is to present an e-diagnosis tool based on ML algorithms that can be used in a smart healthcare monitoring system. We applied the most accurate machine learning methods that have shown superior results in the literature, such as RandomForest, XGBoost, MLP, J48, AdaBoost, NaiveBayes, LogitBoost, and KNN, to different medical datasets. Each single classifier can be efficient on a different dataset. Thus, an ensemble model using majority voting was designed to take advantage of the well-performing single classifiers. Ensemble learning aims to combine the predictions of multiple individual classifiers to achieve higher performance than individual classifiers in terms of precision, specificity, sensitivity, and accuracy. Furthermore, we benchmarked our proposed model against the most efficient and well-known ensemble models, such as Bagging and Stacking, based on the cross-validation technique. The experimental results confirm that the ensemble majority voting approach based on the top three classifiers (MultilayerPerceptron, RandomForest, and AdaBoost) achieves the highest accuracy of 88.12% and outperforms all other classifiers. This study demonstrates that the majority voting ensemble approach proposed above is the most accurate machine learning classification approach for the prediction and detection of coronary artery disease.
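
Hard majority voting, as described in this abstract, can be sketched in a few lines. The example below is an illustration under assumptions, not the authors' code: the function name `majority_vote` and the three prediction lists are hypothetical stand-ins for the outputs of already trained base classifiers.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample (hard voting)."""
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same number of samples")
    combined = []
    for sample_votes in zip(*predictions):
        # most_common(1) gives the label with the most votes;
        # ties resolve to the label that appears first among the votes.
        label, _ = Counter(sample_votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical outputs of three base classifiers (1 = CAD, 0 = no CAD).
mlp_preds = [1, 0, 1, 1]
rf_preds = [1, 1, 0, 1]
ada_preds = [0, 0, 1, 1]

ensemble_preds = majority_vote([mlp_preds, rf_preds, ada_preds])
```

Using an odd number of voters on a binary task, as with the three classifiers here, avoids ties altogether.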